UBER Analytics

Date

Sep 18, 2020

Categories

Transport

The first step is to import all of the data.
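As a rough sketch of the import step, assuming the trips are stored as several CSV files in one folder: the folder layout, the read_trip_files helper, and the throwaway demo files are illustrative assumptions, not the code used for this post.

```r
library(dplyr)

# Hypothetical helper: read every CSV at the given paths and stack the rows.
read_trip_files <- function(paths) {
  bind_rows(lapply(paths, read.csv, stringsAsFactors = FALSE))
}

# Demo on two temporary files with placeholder rows (not the real data set).
demo_dir <- tempfile("trips_")
dir.create(demo_dir)
write.csv(
  data.frame(Date_Time = "04/01/2014 00:11", Lat = 40.8, Lon = -74.0, Base = "B02512"),
  file.path(demo_dir, "part1.csv"), row.names = FALSE
)
write.csv(
  data.frame(Date_Time = "04/01/2014 00:17", Lat = 40.7, Lon = -74.0, Base = "B02512"),
  file.path(demo_dir, "part2.csv"), row.names = FALSE
)

csv_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
data_2014 <- read_trip_files(csv_paths)
```

Stacking the files into one data frame up front means every later step can work on a single object.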

Because the raw data is messy and unformatted, particularly the dates and times, we need to clean it up before we can process it correctly. To do that, we use lubridate to convert our data into an understandable format.

## # A tibble: 4,534,327 x 9
##    Date_Time        Date       Year  Month Day   DayOfWeek   Lat   Lon Base  
##    <chr>            <fct>      <fct> <fct> <fct> <ord>     <dbl> <dbl> <chr> 
##  1 04/01/2014 00:11 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
##  2 04/01/2014 00:17 2014-04-01 2014  April 01    Tue        40.7 -74.0 B02512
##  3 04/01/2014 00:21 2014-04-01 2014  April 01    Tue        40.7 -74.0 B02512
##  4 04/01/2014 00:28 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
##  5 04/01/2014 00:33 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
##  6 04/01/2014 00:33 2014-04-01 2014  April 01    Tue        40.7 -74.0 B02512
##  7 04/01/2014 00:39 2014-04-01 2014  April 01    Tue        40.7 -74.0 B02512
##  8 04/01/2014 00:45 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
##  9 04/01/2014 00:55 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
## 10 04/01/2014 01:01 2014-04-01 2014  April 01    Tue        40.8 -74.0 B02512
## # ... with 4,534,317 more rows
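Columns like the ones shown above could be derived along these lines. This is a minimal sketch that assumes the raw timestamps are month/day/year hour:minute strings, as in the Date_Time column; the intermediate parsed column is a name chosen here for illustration, and the exact factor types may differ from the original.

```r
library(dplyr)
library(lubridate)

# Two raw timestamps copied from the output above.
raw <- tibble(Date_Time = c("04/01/2014 00:11", "04/01/2014 00:17"))

clean <- raw %>%
  mutate(
    parsed    = mdy_hm(Date_Time),                       # text -> date-time
    Date      = factor(as_date(parsed)),                 # calendar date
    Year      = factor(year(parsed)),
    Month     = factor(month(parsed, label = TRUE, abbr = FALSE)),
    Day       = factor(format(parsed, "%d")),            # zero-padded day of month
    DayOfWeek = wday(parsed, label = TRUE)               # ordered factor, Sunday first
  ) %>%
  select(-parsed)
```

Month and weekday labels come from the current locale, so they may not be in English on every machine.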

Now we want to see how the number of trips made by Uber varies throughout the week. To do that, we build a summary table of trips per day of the week.
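One way to build such a table is to count rows per weekday with dplyr. The sketch below runs on a tiny made-up stand-in for the cleaned data (toy_trips is a placeholder, not real trip data), assuming one row per trip and an ordered DayOfWeek column.

```r
library(dplyr)

weekdays_abbr <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# Toy stand-in for the cleaned data: one row per trip (made-up rows).
toy_trips <- tibble(
  DayOfWeek = factor(c("Tue", "Tue", "Wed", "Fri", "Fri", "Fri"),
                     levels = weekdays_abbr, ordered = TRUE)
)

# One row per weekday with the number of trips on that day.
trips_by_day <- toy_trips %>%
  count(DayOfWeek, name = "Trips")
```

Because DayOfWeek is an ordered factor, the rows come out in calendar order rather than alphabetical order.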

Unfortunately, numbers aren't as easy to digest, so let's plot this on a graph.
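A bar chart is one natural choice here. The following is a hedged sketch with ggplot2; the counts are made-up placeholders so the code runs on its own, and the chart type is an assumption rather than the plot used in this post.

```r
library(ggplot2)

weekdays_abbr <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# Placeholder counts, only to make the sketch runnable (not real results).
trips_by_day <- data.frame(
  DayOfWeek = factor(weekdays_abbr, levels = weekdays_abbr, ordered = TRUE),
  Trips     = c(3, 5, 4, 6, 7, 8, 6)
)

# One bar per weekday, bar height = number of trips.
p <- ggplot(trips_by_day, aes(x = DayOfWeek, y = Trips)) +
  geom_col() +
  labs(x = "Day of week", y = "Number of trips", title = "Trips by day of week")
```

Calling print(p) draws the chart.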

Alright, but it's still kind of hard to understand when it's busiest for Uber, so let's try this out…

## `summarise()` regrouping output by 'DayOfWeek' (override with `.groups` argument)
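The message above points to a grouped summary whose first grouping variable is DayOfWeek. The second variable is not shown, so the sketch below assumes it is the hour of day and draws the result as a tile heatmap. The Hour column, the toy rows, and the plot type are all assumptions for illustration.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# First three timestamps are from the output earlier in the post;
# the last one is a made-up placeholder to add a second weekday.
toy_trips <- tibble(
  Date_Time = c("04/01/2014 00:11", "04/01/2014 00:17",
                "04/01/2014 01:01", "04/02/2014 08:30")
) %>%
  mutate(
    parsed    = mdy_hm(Date_Time),
    DayOfWeek = wday(parsed, label = TRUE),
    Hour      = hour(parsed)          # assumed second grouping variable
  )

# Trips per weekday and hour; .groups = "drop" silences the regrouping message.
trips_by_day_hour <- toy_trips %>%
  group_by(DayOfWeek, Hour) %>%
  summarise(Trips = n(), .groups = "drop")

# Darker or lighter tiles show busier or quieter weekday/hour combinations.
p <- ggplot(trips_by_day_hour, aes(x = Hour, y = DayOfWeek, fill = Trips)) +
  geom_tile() +
  labs(x = "Hour of day", y = "Day of week", fill = "Trips")
```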

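library(dplyr)           # sample_frac() and the %>% pipe
library(leaflet)
library(leaflet.extras)  # addHeatmap()

# Work with a reproducible 10% random sample of the trips
set.seed(1234)
plot_data <- data_2014 %>%
  sample_frac(0.1, replace = FALSE)

leaflet(data = plot_data) %>%
  # Base map tiles
  addProviderTiles("OpenStreetMap.Mapnik") %>%
  # Trip locations as clustered circle markers
  addCircleMarkers(
    lng = ~Lon,
    lat = ~Lat,
    clusterOptions = markerClusterOptions()
  ) %>%
  # Heatmap layer showing where trips are concentrated
  addHeatmap(
    lng = ~Lon,
    lat = ~Lat,
    radius = 15,
    blur = 40,
    cellSize = 30
  )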